Abstract

We introduce a probabilistic model for protein sliding motion along DNA during the search for a target sequence. The model accounts for possible effects due to sequence-dependent interaction between the nonspecific DNA and the protein. As an example, we focus on T7 RNA-polymerase and exploit the available information about its interaction at the promoter site in order to investigate the influence of the bacteriophage T7 DNA sequence on the dynamics of the sliding process. Hydrogen bonds in the major groove are used as the main sequence-dependent interaction between RNA-polymerase and DNA. The resulting dynamical properties and the possibility of an experimental verification are discussed in detail. We show that, while at large times the process reaches a pure diffusive regime, it initially displays a subdiffusive behavior. The subdiffusive regime can last sufficiently long to be of biological interest.

Introduction

The way by which proteins can find their target sites along a DNA chain represents a puzzling problem. In many cases, the reaction rate has been demonstrated to be faster than diffusion controlled (Riggs et al., 1970; Berg et al., 1981; Reich and Mashhoon, 1991; Surby and Reich

Nevertheless, a precise experimental determination of the statistical law characterizing the diffusion motion of the protein along DNA during the specific site search is presently lacking. It is believed that during the sliding motion, the activation barrier for the translocation of the protein to contiguous nonspecific positions is high enough to randomize the protein motion through collisions with the solvent water, but appropriately small compared to the thermal energy, in order to allow the protein to move (von Hippel and Berg, 1989). This has induced some authors to propose a model where the protein freely slides along DNA under the effect of the thermal fluctuations without any sequence-dependent interaction, i.e., the DNA is seen as a homogeneous cylinder on which the protein can diffuse until the specific site is reached (von Hippel and Berg, 1989; von Hippel et al., 1996; Park et al., 1982b).

During sliding, however, the protein must be able to distinguish the specific region from nonspecific DNA, so that a recognition mechanism must be involved. In this regard, the possibility that sliding could imply a sequence-dependent protein-DNA interaction is rather reasonable.

The aim of the present paper is to investigate this idea in the context of a simple probabilistic model for RNA-polymerase (RNAP) sliding along DNA, which accounts for the sequence-dependent interaction between the nonspecific DNA and the enzyme. As an illustrative example we consider the case of the T7 RNA-polymerase sliding on the bacteriophage T7 DNA (Dunn and Studier, 1983). Although the results of the paper are likely to be valid also for other enzymes, the T7 RNAP has several advantages which make it suitable for our modelling. In particular, we mention the simplicity of the enzyme (it is a small enzyme (100 kDa) composed of only one unit, and recognizes a single asymmetric region on the DNA), and the availability of high resolution crystal structure data both for the RNAP alone and for the RNAP bound to its promoter (Jeruzalmi and Steitz, 1998; Cheetam et al., 1999; Cheetham and Steitz, 1999). In contrast to more complicated enzymes such as lac repressor (Winter et al., 1981), restriction endonuclease (EcoRI (Jack et al., 1982; Ehbrecht et al., 1985), EcoRV (Dowd and Lloyd, 1990; Stanford et al., 2000b)), methyl transferase (EcoRI (Surby and Reich, 1996)), E. coli RNA polymerase (Park et al., 1982b), etc., no direct evidence of diffusive sliding motion has been presented for T7 RNAP. However, the fact that the enzyme is able to locate its promoters inside about 40000 base pairs of DNA during a time much shorter than what a three dimensional search would require (Endy et al., 2000) strongly suggests a sliding mechanism also in this case. In our model we assume therefore that T7 RNAP proceeds by sliding during the promoter search. The model is based on the idea that the RNAP needs to "read" the underlying sequence during sliding in order to test whether special "signals" associated with the promoter are present, i.e., a sequence-dependent interaction should be at work during the search. This means that the DNA sequence can influence the dynamics of the polymerase also far from the promoter. In this sense, the stop at the promoter should be the extreme effect of a complex dynamics, i.e., the RNAP should follow a noise-influenced, sequence-dependent motion that includes the possibility of slowing down, pauses and stops. From this point of view the usual assumption of a standard random walk of the RNAP along DNA (Berg et al., 1981; Kabata et al., 1993; Harada et al., 1999; Guthold et al., 1999; Stanford et al., 2000a) appears inadequate.

To investigate the possibility of a sequence-dependent diffusion motion of the RNAP along the DNA, we define a base sequence energy landscape from which hopping rates of the enzyme on the DNA (viewed as a discrete inhomogeneous lattice) can be deduced. Since only limited experimental knowledge exists about nonspecific DNA-protein interaction, we shall use information about the sequence-dependent RNAP-DNA interaction inside the promoter region and extrapolate it to nonspecific regions. The diffusive motion of the RNAP is then studied by Monte Carlo simulations of the probabilistic process on the energy landscape, both in the absence and in the presence of thresholds which define different rules for the hopping motion. As a result we show that, while at large times the process reaches a pure diffusive regime, at the initial stage it displays a subdiffusive behavior. It is remarkable that the anomalous diffusion regime can last for times long enough to be observable in single molecule experiments similar to those that have made it possible to visualize sliding for the E. coli RNAP (Kabata et al., 1993; Harada et al., 1999; Guthold et al., 1999). Single molecule experiments on T7 RNAP are indeed underway in several laboratories (Heslot, 2005; Baumann, 2002; Place, 2002). We remark that base sequence induced dynamics along DNA was also considered in Ref. (Salerno, 1991, 1995) in connection with a nonlinear model of DNA, and in Ref. (Jülicher and Bruinsma, 1997) in connection with the RNAP motion during the transcription process.

The paper is organized as follows: in Section 1 we use some known data on the T7 RNAP-promoter complex to introduce a sequence-dependent model for the RNAP-DNA nonspecific interaction. An energy landscape with minima corresponding to the recognition sequence is constructed. We then introduce four possible models for the RNAP diffusive motion along the DNA by using the sequence-induced energy landscape and its modification through the inclusion of energy thresholds, which allow us to describe different possible reading mechanisms. The rate of translocation to the neighboring sites is constructed from the energy landscapes (for the different models) by means of the Arrhenius law. In Section 2 we use Monte Carlo simulations to study in detail the different dynamical regimes of our models. Finally, we discuss in Section 3 the limits of our analysis and the possibility of checking the results with experiments, so as to verify whether the inferred mechanism actually corresponds to the real one. Then we draw our conclusions.



1 Methods: experimental data and theoretical model

1.1 T7 RNAP-DNA interaction and promoter recognition

The stability of the RNAP-DNA nonspecific complex is mainly due to electrostatic interaction with the backbone phosphates of DNA (von Hippel et al., 1996) and to the entropic release of cations (deHaseth et al., 1977; Sidorova and Rau, 2001; Singer and Wu, 1988). For the specific interaction, while ionic effects could still be present (Record et al., 1977), the major stabilization effect arises from the release of water molecules (Sidorova and Rau, 2001). The presence of a layer of water between protein and DNA in the nonspecific complex weakens the specific interaction. This suggests that a continuous variation between specific and nonspecific binding exists (Jeltsch et al., 1994); the transition from the nonspecific to the specific complex can be induced by conformational changes of the proteins (Spolar and Record, 1994).

Besides these stabilizing factors, sequence-dependent interaction allows the RNA polymerase to test the DNA during the promoter search (Travers, 1993). Experimental data on endonuclease EcoRI show that pausing of the protein during sliding occurs at sites which resemble the specific sequence (Jeltsch et al., 1994). Thus, the nonspecific "reading" should be of the same nature as the specific recognition¹. From this observation one can deduce that at least some of the different kinds of interaction observed in the specific complex could be already present during sliding, and might be used in the recognition mechanism. This hypothesis can be interesting even if the actual reading mechanism is not exactly the same but of a similar type; the study of the consequent dynamics may help, from a general point of view, in understanding which kind of sequence-dependent interaction is compatible with the experimental data.

The first point to address, in order to have a suitable description of the promoter search dynamics, is therefore to determine which sequence-dependent interaction is responsible for the promoter recognition by the T7 RNAP. Experimental results seem to indicate, as we will now discuss, that a specific set of hydrogen bonds on the 5 bps sequence GAGTC represents the main recognition core. We will therefore use this set of bonds as the main recognition tool in our model. Biochemical and structural analysis gives very precise information on the principle of promoter recognition (McAllister, 1997; Souza, 1997) and on the polymerase-promoter specific complex for the case of bacteriophage T7 (Cheetam et al., 1999). T7 RNAP recognizes a 23 bps promoter, which extends from -17 to +6 relative to the initiation site and consists of two functional domains (McAllister, 1997). It is reasonable to assume that the initiation domain, extending from -4 to +6, does not interfere directly in the promoter search: measurements of the dissociation constant with small oligonucleotides carrying a truncated promoter have indeed shown that these base pairs do not participate in the promoter recognition (Ujvari and Martin, 1997). Let us then consider the binding domain (from -17 to -5).

¹ Also remark that, for the case of the CRP protein, nonspecific binding has been proposed to mimic the specific, cAMP-dependent binding (Katouzian-Safadi et al., 1993), thus confirming the hypothesis of a continuity between nonspecific and specific recognition interaction.

Figure 1: The positions of all the possible major groove interacting sites where base pairs can make hydrogen bonds (top) and the corresponding base-pair patterns (bottom). Blue and red disks indicate the hydrogen donor and acceptor DNA groups respectively. White positions correspond to hydrogen atoms and yellow ones to methyl groups. Each base pair is associated with a different 1x4 pattern.
Different biochemical studies, together with a recent crystallographic analysis, contribute to the determination of the most relevant base pairs in this region. On one hand, a hierarchy of base pair preferences was determined by single point mutations in the promoter (Chapman and Burgess, 1987; Diaz et al., 1993; Imburgio et al., 2000). These studies have shown lower sequence sensitivity of the region -17 to -12: the specificity arises from bases -11 to -5, being more stringent on bases -7 to -9. The identification of the functional groups of the DNA involved in those potential contacts shows that direct contacts in the recognition region -11 to -5 arise mostly through the major groove of a double strand promoter (Schick and Martin, 1995; Li et al., 1996). On the other hand, the crystal structure of T7 RNAP bound to its promoter (Cheetam et al., 1999) is consistent with these biochemical studies (Imburgio et al., 2000) and draws a structural picture of the T7 RNAP promoter interaction². In particular, a set of sequence-specific bonds between protein side chains and bases in the major groove arises in the region -11 to -7, via the formation of hydrogen bonds with the appropriate acceptor or donor chemical groups on the base pair sides (see Fig. 1 and Fig. 2).

We remark that the previously mentioned kinetic studies suggest that base pairs -5 and -6 can also contribute to the recognition mechanism (Li et al., 1996): these contacts are probably lost once the open complex is formed, so that the mentioned crystallographic analysis does not show them. In any case, the base specificity appears to be less stringent for these two contacts too (Li et al., 1996). Because of their weak specificity, we will neglect these two interacting base pairs, and focus here just on the hydrogen bond mediated interaction arising on bases -11 to -7, which has the strongest sensitivity to the base pairs. The question addressed will therefore be how the specific interaction of this 5 bps region influences the polymerase motion.

Hydrogen bond acceptors and donors are regularly positioned on the promoter major groove: the DNA geometry is in fact such that each of the four different base pairs exposes four possible major groove interacting sites, as depicted in Fig. 1 (Seeman et al., 1976). These sites can be either H-bond acceptors and donors, or sites where a hydrogen atom or a methyl group is present. In the latter case they do not bond directly to the polymerase (at least at the promoter site). Fig. 2 depicts the H-bonds actually made between polymerase and DNA at the promoter, as revealed by the crystallographic analysis. The two semicircles in the left part of Fig. 2 and their corresponding positions in the pattern on the right refer to the presence of a hydrogen bond which is shared between two DNA sites through a water molecule (Cheetam et al., 1999).

We shall assume that, in each position along DNA, the RNAP "tries" to make the same set of hydrogen bonds as at the promoter, testing in this way the underlying sequence. It is convenient to represent the RNAP by a recognition matrix able to match its target sequence, i.e., containing the pattern of active chemical groups that allows for the best binding at the promoter. We suppose therefore that each position along DNA will have a certain number of made (matches) and unmade (mismatches) hydrogen bonds with the polymerase recognition matrix. For simplicity, we will represent the recognition pattern directly in terms of its corresponding binding sites on DNA³.

² See Figs. 1 and 2 in Ref. (Cheetam et al., 1999) for a clear representation of the whole set of interactions.

Figure 2: A sketch of the DNA interaction sites at the promoter, where hydrogen bonds with the corresponding RNAP chemical groups are made. Blue and red disks indicate the hydrogen donor and acceptor DNA groups respectively; the two half disks correspond to a couple of sites that could share a water mediated hydrogen bond. On the right, the corresponding 5x4 pattern that RNAP recognizes. Figure labels: template and non-template strands, positions -7 to -11; legend entries: H-bond (DNA donor), H-bond (DNA acceptor), 1/2 H-bond (DNA acceptor) through a water molecule.



³ This is also a way to recall that we actually do not include in the model all the possible polymerase-DNA nonspecific bonds, but only those that are made at the promoter, for which experimental evidence is available.

One expects that each match will stabilize the complex, while mismatches will act so as to destabilize the RNAP, which will therefore tend to move away from the "wrong" positions (von Hippel and Berg, 1989; von Hippel et al., 1996). For each position $n$ along the chain we define an energy $E(n)$, simply by counting the number of matches and mismatches, and adding a corresponding negative or positive amount of energy, respectively (empty sites in the recognition matrix do not contribute to the energy). Interacting sites corresponding to the semicircles in Fig. 2 are evaluated in a first approximation as half hydrogen bonds everywhere along the chain.

Formally, the energy is defined by denoting by +1, -1, 0 respectively the acceptor, donor, and noninteracting DNA sites. The DNA sequence is then represented as a list of vectors, $\ldots, b_{n-1}, b_n, b_{n+1}, \ldots$, where

$$b_n = \begin{cases} (1,-1,1,0)^T & \text{for base A} \\ (0,1,-1,1)^T & \text{for base T} \\ (1,1,-1,0)^T & \text{for base G} \\ (0,-1,1,1)^T & \text{for base C} \end{cases}$$

The polymerase acts, at position $n$, on the sequence of 5 bases that is represented by the 4x5 matrix $D_n = (b_n, b_{n+1}, b_{n+2}, b_{n+3}, b_{n+4})$. The consensus sequence GAGTC at the promoter site corresponds therefore to the matrix

$$D = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & -1 & 1 & 1 & -1 \\ -1 & 1 & -1 & -1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}$$

We then define a 4 x 5 recognition matrix $R(i,j)$, corresponding to Fig. 2,

/ 1
1 -1
1/2 1 /

where the factors 1/2 have been introduced in order to reproduce the shared hydrogen bond previously mentioned. With this notation, the interaction energy can be written simply as

$$E(n) = -\epsilon \, \mathrm{tr}(R \cdot D_n) \qquad (1)$$



where the dot $\cdot$ denotes the usual matrix multiplication and tr is the trace. Minima correspond to complete matching and thus to the recognition sequence GAGTC. Each positive or negative contribution to the energy, $\epsilon$, is equal to a hydrogen bond energy. Note that the mobility of RNAPs depends dramatically on $\epsilon/k_BT$. At room temperature, $k_BT$ is about 0.025 eV (or $RT = 0.6$ kcal/mol, $R = N_a k_B$); the energy barriers must be smaller in order to allow the RNAP to move and reach the promoter site⁴. Since there are no direct measurements of the interaction energies during sliding and it is difficult to make an estimate of the hydrogen bond strength involved, we shall use $\epsilon/k_BT$ as a free parameter. The resulting energy $E(n)$ defines an irregular landscape on which the RNAP can move, as will be discussed in the next subsection.
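As an illustration of how the energy landscape can be evaluated on a sequence, the following minimal Python sketch encodes each base with the vectors given above and slides a 5-base window along a toy sequence. The names `BASE_VECTORS`, `site_matrix`, `energy`, `landscape` and `PLACEHOLDER_R` are hypothetical. Note in particular that `PLACEHOLDER_R` is an assumption made only so that the example runs: it mirrors every interacting site of GAGTC with unit weight, whereas the recognition matrix of the text also contains empty sites and 1/2 entries. The trace is evaluated as the sum of the elementwise products of two equally shaped matrices, which is another assumption about the intended contraction.

```python
# Base-pair encoding from the text: +1 acceptor, -1 donor, 0 non-interacting.
BASE_VECTORS = {
    "A": (1, -1, 1, 0),
    "T": (0, 1, -1, 1),
    "G": (1, 1, -1, 0),
    "C": (0, -1, 1, 1),
}


def site_matrix(seq, n, width=5):
    """Return D_n as a list of `width` column vectors b_n ... b_{n+width-1}."""
    return [BASE_VECTORS[base] for base in seq[n:n + width]]


# ASSUMPTION: placeholder recognition pattern, NOT the matrix of the paper.
# It matches every interacting site of GAGTC with unit weight.
PLACEHOLDER_R = site_matrix("GAGTC", 0)


def energy(seq, n, R=PLACEHOLDER_R, eps=1.0):
    """E(n) = -eps * sum_ij R_ij * (D_n)_ij, i.e. -eps * tr(R^T . D_n)."""
    D = site_matrix(seq, n, width=len(R))
    overlap = sum(r * d for r_col, d_col in zip(R, D)
                  for r, d in zip(r_col, d_col))
    return -eps * overlap


def landscape(seq, R=PLACEHOLDER_R, eps=1.0):
    """Energies for every position where the full window fits."""
    return [energy(seq, n, R, eps) for n in range(len(seq) - len(R) + 1)]


seq = "ATTGAGTCCA"  # toy sequence, not T7 DNA; GAGTC starts at index 3
E = landscape(seq)
print(E)
```

With this placeholder pattern the window that exactly covers GAGTC is the unique minimum of the toy landscape, which is the qualitative property the model relies on.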



1.2 Sequence dependent RNAP diffusive model

We shall introduce in this subsection four versions of the model, describing different mechanisms of the fundamental translocation step in the enzyme motion. The length of hydrogen bonds (up to 3.5 Å in DNA-protein interaction (Nadassy et al., 1999)) can roughly reach the same order of magnitude as the distance between base pairs (3.4 Å). Therefore, the RNAP may possibly shift directly from one position to the next one without activation energy for the one step process. On the contrary, if the RNAP has to disrupt partially or completely the hydrogen bonds on one site before moving to the next position, it has to overcome an additional activation barrier.

Furthermore, RNAP could have some internal flexibility allowing for conformational changes, possibly depending on the local degree of stability: i.e., it is possible that, if too many mismatches are found, RNAP undergoes a conformational change from a "reading" mode to a "sliding" mode, where no hydrogen bonds are effectively made (von Hippel et al., 1996). In this case, one has a sort of two-state model, for which, if the total energy $E(n)$ is over a threshold $E_t$, the system passes to a different state of constant energy $E_{sl}$ where RNAP can freely slide.

⁴ This may seem inconsistent with the usual measured strength of chemical hydrogen bonds, which normally corresponds to a few kcal/mol (Stryer, 1995; Voet and Voet, 1995). The distance and orientation of the hydrogen bonds, however, together with their net energetics due to the interaction with the solvent, may be responsible for a relevant lowering of the interaction energy.

Figure 3: A schematic picture of the four considered variants of the model. On the horizontal axis, we represent a few (30) positions along DNA. Correspondingly we sketch the interaction energy $E$ varying between its minimum ($E_m$) and its maximum ($E_M$) values. The interaction energy evaluated on the T7 DNA presents similar rapid oscillations between different levels. The dotted lines indicate the threshold level $E_t$, set to $E_M$ for model II and to an intermediate value for models III and IV. In the case of model IV, all energy levels above the threshold are redefined to a common value $E_{sl}$ (dashed line).



To account for all these possibilities, we define and analyze some different models, sketched in Fig. 3 and listed hereafter:

I) no-threshold model (Fig. 3, I): hydrogen bonds can directly translate from one position to another without being destroyed. In this case the energy difference $\Delta E_{n \to n'}$ from $n$ to $n' = n \pm 1$ is simply

$$\Delta E_{n \to n'} = \max[E(n') - E(n), 0]; \qquad (2)$$

here $\Delta E_{n \to n'}$ is set to zero if $E(n') - E(n)$ is negative, as usual.

II) maximal-threshold model (Fig. 3, II): in order to reach a next site, RNAP must destroy all bonds and pass through a state of "total mismatch". In this case $\Delta E_{n \to n'} = E_M - E(n)$, where $E_M = \max[E(n)]$.

III) intermediate-threshold model (Fig. 3, III): in order to reach a next site, RNAP must destroy all bonds and pass through an intermediate "zero" state defined by a threshold energy $E_t$. One has therefore

$$\Delta E_{n \to n'} = \max[E_t - E(n), E(n') - E(n), 0]$$

Models I and II are actually the two limiting cases of model III when the threshold is set to the minimum and maximum values of the potential energy, respectively. These three models could therefore be considered as three cases of a unique model, dependent only on the choice of the energy threshold. For convenience, we will nevertheless refer to these three cases as models I, II and III in the following. Note that in the general case of an intermediate threshold, the previous model gives two different possible regimes for the polymerase, because the energy profile is qualitatively different in regions where $E(n)$ is greater or lower than $E_t$.

Finally, to account simultaneously for the two possible regimes of the RNAP-DNA interaction mentioned above, we propose a fourth model as follows:

IV) two-regimes model (Fig. 3, IV): a threshold energy $E_t$ separates "reading" regions, where the energy is $E(n) < E_t$, from "sliding" regions, where no hydrogen bonds are made and the RNA polymerase can freely diffuse on a flat energy landscape, $E(n) = E_{sl}$. Below the threshold, the barrier $E_t$ still affects the translocation as in the case of model III. For simplicity, we will fix the value of $E_{sl}$ to $E_M = \max[E(n)]$. In this case, one can redefine the energy as

$$E(n) = \begin{cases} E(n) & \text{if } E(n) < E_t \\ E_{sl} & \text{if } E(n) \geq E_t \end{cases} \qquad (3)$$

and $\Delta E_{n \to n'}$ is then defined as in case III.

Note that our model IV interpolates between the straight sequence-dependent walk (model I) and the biological model of the promoter search proposed by von Hippel in Ref. (von Hippel and Berg, 1989; von Hippel et al., 1996). The scenario suggested by von Hippel relies indeed on the idea that the specific interaction is "switched off" by a conformational change if too many mismatches are present. In that picture, RNAP is more often in a "sliding" mode, where the specific hydrogen bond interaction is inactive. A quantitative description of this mechanism can be obtained by the introduction of our model IV, where the varying threshold level $E_t$ accounts for the degree of homology which leads to the supposed RNAP conformational change.

The rates $r_{n \to n'}$ of translocation between neighboring sites $n$ and $n'$ are, according to the Arrhenius law, proportional to $\exp(-\Delta E_{n \to n'}/k_BT)$, where $n' = n \pm 1$. The model includes a nonzero probability for the polymerase to stop at one position; the complete set of translocation rates reads therefore:

$$r_{n \to n'} = \tfrac{1}{2} \exp(-\Delta E_{n \to n'}/k_BT), \quad n' = n \pm 1$$
$$r_{n \to n} = 1 - r_{n \to n+1} - r_{n \to n-1}. \qquad (4)$$

In the case of a flat energy landscape ($\Delta E_{n \to n'} = 0$) all the rates $r_{n \to n'}$ are equal to 1/2, which defines a simple one-dimensional diffusion process with diffusion constant $D = 1$.
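The barrier definitions of models I to IV and the rates of Eq. (4) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names `barrier`, `flatten_for_model_IV` and `rates`, and the toy energy list `E`, are hypothetical, and the sketch assumes that site `n` is an interior site of the list (no boundary handling).

```python
import math


def barrier(E, n, n_next, model, E_t=None):
    """Activation energy dE_{n->n'} for the models described in the text."""
    if model == "I":
        return max(E[n_next] - E[n], 0.0)
    if model == "II":
        return max(E) - E[n]
    if model in ("III", "IV"):
        # For model IV, E is expected to be the redefined landscape of Eq. (3).
        return max(E_t - E[n], E[n_next] - E[n], 0.0)
    raise ValueError("unknown model: %r" % (model,))


def flatten_for_model_IV(E, E_t):
    """Eq. (3): levels not below E_t are replaced by E_sl = max(E)."""
    E_sl = max(E)
    return [e if e < E_t else E_sl for e in E]


def rates(E, n, model, kT=1.0, E_t=None):
    """Return (r_left, r_stay, r_right) at interior site n, following Eq. (4)."""
    r_left = 0.5 * math.exp(-barrier(E, n, n - 1, model, E_t) / kT)
    r_right = 0.5 * math.exp(-barrier(E, n, n + 1, model, E_t) / kT)
    return r_left, 1.0 - r_left - r_right, r_right


E = [-3.0, 1.0, -1.0, 2.0, 0.0]  # toy landscape, arbitrary units of kT
print(rates(E, 2, "I"))
print(rates(E, 2, "III", E_t=0.0))
print(rates(flatten_for_model_IV(E, 0.0), 2, "IV", E_t=0.0))
```

Setting the threshold of model III to the minimum or to the maximum of the landscape reproduces the barriers of models I and II respectively, which is a convenient consistency check on any implementation.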

If the discretization of length, $x = \ell n$ ($\ell = 3.4$ Å is the base pair step), and time, $t = m\tau$, is explicitly taken into account, then the dimensionless $D$ given by the relation $\langle n^2 \rangle = 2Dm$ corresponds to a physical value of $D\ell^2/\tau$. In order to give a quantitative meaning to our results we need an estimate for the (mean) time $\tau$ required for each translocation step. The upper diffusion limit, $D = 1$, associated to a physical diffusion constant $\ell^2/2\tau$, would correspond to a free diffusion without any local trapping effect. Schurr (Schurr, 1979) has estimated this upper limit of the one-dimensional diffusion constant of lac repressor sliding and rotating along the DNA helix track to be $D_{lac} = 4.5 \times 10^{-9}$ cm$^2$/s. The lac repressor was approximated by a hard ball of radius $a$ moving in a viscous medium. Using Schurr's approach, and accounting for the difference in sizes between the lac repressor, $a_{lac} = 4.9 \times 10^{-7}$ cm, and the T7 RNA polymerase, $a_{RNAP} \approx 7 \times 10^{-7}$ cm, the upper limit of the polymerase diffusion constant would rescale as (see Ref. (Schurr, 1979) for details): $D_{RNAP} = D_{lac} (a_{lac}/a_{RNAP})^3 \approx 1.54 \times 10^{-9}$ cm$^2$/s. The latter, being compared to a "free diffusion" limit $\ell^2/(2\tau)$, $\ell = 0.34$ nm, sets the elementary time interval $\tau = \ell^2/(2D) \approx 3.8 \times 10^{-7}$ s, during which a translocation to the nearest base pair may happen.
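The arithmetic behind this estimate can be checked with a short Python sketch. It assumes a cubic dependence on the ratio of the radii, which is the exponent that reproduces the quoted value of about 1.54 x 10^-9 cm^2/s from the quoted inputs; the variable names are hypothetical.

```python
# Values quoted in the text (CGS units).
D_LAC = 4.5e-9    # cm^2/s, upper limit for lac repressor (Schurr, 1979)
A_LAC = 4.9e-7    # cm, effective radius of lac repressor
A_RNAP = 7.0e-7   # cm, approximate effective radius of T7 RNAP
STEP = 0.34e-7    # cm, base pair step (0.34 nm)

# ASSUMPTION: cubic size scaling, chosen because it matches the quoted result.
D_rnap = D_LAC * (A_LAC / A_RNAP) ** 3

# Elementary time step from the free-diffusion limit D = step^2 / (2 tau).
tau = STEP ** 2 / (2.0 * D_rnap)

print("D_RNAP = %.3g cm^2/s" % D_rnap)
print("tau    = %.3g s" % tau)
```

Under these assumptions one simulation step corresponds to a fraction of a microsecond, so that 10^6 steps span a fraction of a second of physical time.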

Let us finally consider the distribution of energy levels that is obtained when the real T7 DNA sequence is considered and the energy landscape is evaluated through the local degree of homology by Equation (1). In Fig. 4 the energy distribution evaluated on the whole T7 sequence is represented. As can be seen by comparing with the superimposed fit, the resulting distribution for models I to III is almost Gaussian. Note that in the case of model IV all contributions to levels above the threshold $E_t$ are obviously condensed in a unique level $E_{sl}$ (see inset of Fig. 4).

Figure 4: Energy level distribution (models I to III), obtained by averaging over the whole T7 DNA. A Gaussian fit of the resulting histogram (dashed line) is superimposed for comparison. Inset: The corresponding distribution for model IV ($E_t = 0$).

2 Results: recognition efficiency and anomalous diffusion

The first important check of the four RNAP models is related to their affinity to the promoter region. Theoretically, one can easily estimate the stationary distribution of a population of polymerases on the four different model landscapes as

$$p_\infty(n) \propto e^{-E(n)/k_BT} \qquad (5)$$

As usual, the stationary distribution only depends on the site energy, and not on differences and thresholds. Consequently, models I to III have the same distribution, whereas the redefinition of energy in model IV leads to a substantially different result. Equation (5) straightforwardly implies that the recognition sites, which have the lowest energy, will be on average the most populated.
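The statement that thresholds do not affect the stationary distribution can be checked numerically: with Arrhenius rates built from the model III barrier, the Boltzmann weights satisfy detailed balance between neighboring sites for any threshold. The following Python sketch illustrates this on a toy landscape; the names `boltzmann`, `hop_rate` and `max_imbalance` and the energies in `E` are hypothetical.

```python
import math


def boltzmann(E, kT=1.0):
    """Normalized stationary distribution p_inf(n) ~ exp(-E(n)/kT)."""
    weights = [math.exp(-e / kT) for e in E]
    Z = sum(weights)
    return [w / Z for w in weights]


def hop_rate(E, n, n_next, E_t, kT=1.0):
    """Rate n -> n_next with the model III barrier (models I, II as limits)."""
    dE = max(E_t - E[n], E[n_next] - E[n], 0.0)
    return 0.5 * math.exp(-dE / kT)


def max_imbalance(E, E_t, kT=1.0):
    """Largest relative violation of p(n) r(n->n+1) = p(n+1) r(n+1->n)."""
    p = boltzmann(E, kT)
    worst = 0.0
    for n in range(len(E) - 1):
        fwd = p[n] * hop_rate(E, n, n + 1, E_t, kT)
        back = p[n + 1] * hop_rate(E, n + 1, n, E_t, kT)
        worst = max(worst, abs(fwd - back) / max(fwd, back))
    return worst


E = [-3.0, 1.0, -1.0, 2.0, 0.0, -2.5]  # toy landscape in units of kT
p = boltzmann(E)
for E_t in (min(E), 0.0, max(E)):      # model I limit, model III, model II limit
    print(E_t, max_imbalance(E, E_t))
```

The imbalance stays at the level of floating point rounding for every threshold, and the most populated site is the one of lowest energy.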

In order to verify that this is indeed obtained in a dynamical context, we simulated numerically the time evolution of models I to IV, taking a uniform distribution of independent RNAPs on a DNA region of 1000 bps as initial condition. Note that the assumption of a uniform initial distribution is statistically equivalent to considering the probability evolution of a single polymerase binding to DNA at a random site. The simulation is performed on the first 3000 base pairs of the T7 sequence, which contain two recognition sequences GAGTC, at positions 1126 and 1435.

After a sufficiently long time, the polymerase distribution $p(n)$ spreads out, as shown in the inset of Fig. 5, and shows a series of peaks corresponding to the sites with larger occupancy. Where the border effects can be neglected, this distribution tends to its equilibrium limit; this is shown in Fig. 5, where we plot a portion of the distribution obtained after $10^6$ time steps for model I, together with $p_\infty$. As expected, the larger peaks correspond to energy minima, i.e., to the location of the two recognition sequences GAGTC present in this DNA region. For all the models I to III the final distribution is similar, with the two highest peaks exactly in correspondence with the two recognition sequences, thus confirming that the energy landscape defined on the basis of the pattern matching actually guides the polymerase to the promoter recognition sequences.

Note that, in the case of model IV, the distribution of levels is different, which obviously implies a different shape for $p_\infty(n)$. A sufficiently low threshold energy results in an asymptotic distribution with rarer, larger peaks on a very low constant background (data not shown).

Figure 5: A central portion of the polymerase distribution $p(n)$ for model I after $10^6$ integration steps, obtained by averaging over $3 \times 10^4$ particles initially uniformly distributed in the interval [1000, 2000] (solid line). The analytical equilibrium distribution $p_\infty(n)$ (dotted line) is shown for comparison. Here $\epsilon/k_BT = 0.5$. Inset: the whole distribution at the same time. In both plots, the arrows indicate the location of the recognition sequences GAGTC (sites 1126 and 1435).

We now investigate the dynamical behavior of the four models, and check whether there are some relevant deviations from a random walk, induced by the sequence sensitivity. For large enough values of $\epsilon/k_BT$, some positions along DNA could trap the polymerase for a long time, which implies that, at small and intermediate times, diffusion could be substantially different from that of a pure random walk. In order to estimate this effect, we calculate the mean square displacement of the polymerase:

$$\langle \Delta n^2 \rangle = \langle \Delta n^2(t) \rangle = \frac{1}{N} \sum_{i=1}^{N} (n_i(t) - n_i(0))^2. \qquad (6)$$

We average over $N = 9 \times 10^3$ independent particles, initially distributed uniformly in the DNA region [1000, 2000]. This procedure therefore includes both an average over a large number of particles and over a large set of initial conditions.
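A Monte Carlo estimate of the mean square displacement of Eq. (6) can be sketched as follows for model I. This is only an illustration under stated assumptions: the landscape is a list of random Gaussian energies rather than the T7 landscape, the lattice is taken periodic to avoid border effects, walkers start uniformly over the whole lattice, and the function name `simulate_msd` and all parameter values are hypothetical.

```python
import math
import random


def simulate_msd(E, n_walkers, n_steps, kT=1.0, seed=0):
    """Model I hopping on a periodic lattice; returns the MSD after each step."""
    rng = random.Random(seed)
    L = len(E)
    # Precompute the left and right hopping probabilities of Eq. (4) per site.
    r_left = [0.5 * math.exp(-max(E[(n - 1) % L] - E[n], 0.0) / kT)
              for n in range(L)]
    r_right = [0.5 * math.exp(-max(E[(n + 1) % L] - E[n], 0.0) / kT)
               for n in range(L)]
    start = [rng.randrange(L) for _ in range(n_walkers)]
    disp = [0] * n_walkers  # unwrapped displacement n_i(t) - n_i(0)
    msd = []
    for _ in range(n_steps):
        for i in range(n_walkers):
            site = (start[i] + disp[i]) % L
            u = rng.random()
            if u < r_left[site]:
                disp[i] -= 1
            elif u < r_left[site] + r_right[site]:
                disp[i] += 1
            # otherwise the walker stays where it is for this step
        msd.append(sum(d * d for d in disp) / n_walkers)
    return msd


landscape_rng = random.Random(1)
rough = [landscape_rng.gauss(0.0, 1.0) for _ in range(200)]  # toy landscape
flat = [0.0] * 200

msd_flat = simulate_msd(flat, n_walkers=500, n_steps=200)
msd_rough = simulate_msd(rough, n_walkers=500, n_steps=200, kT=0.5)
print(msd_flat[-1], msd_rough[-1])
```

On the rough landscape the walkers spend many steps trapped in local minima, so the displacement grows much more slowly than on the flat one over the same number of steps.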
Figure 6: Diffusion behavior of model I for different values of $\epsilon/k_BT$. From the upper curve to the bottom: $\epsilon/k_BT$ = 0, 0.3, 0.6, 0.9, 1.2, 1.5. Note the log-log scale: a linear diffusion $\langle \Delta n^2 \rangle \propto t$ corresponds in this graph to the straight lines of unit slope (solid lines), while slopes lower than 1 correspond to $\langle \Delta n^2 \rangle = A t^b$, with $b < 1$. A (dashed) line of slope 0.3 is reported for comparison.



Starting from model I, we investigate the dependence of the diffusive
behavior on $\epsilon/k_B T$. Results are shown in Fig. 6. In the limit
$\epsilon/k_B T = 0$, i.e., in the case of a flat potential (or
$T = \infty$), the diffusion is of course normal, with $D = 1$ and
$\langle \Delta n^2(t) \rangle = 2t$, so that the corresponding curve is
a straight line of slope 1 in the log-log plot (upper curve in Fig. 6).
For larger values of $\epsilon/k_B T$ (lower temperatures compared with
the energy fluctuations), the dynamics of the model initially shows
large deviations from normal diffusion: in these finite-temperature
cases, the motion is initially subdiffusive, with

$$\langle \Delta n^2 \rangle = A t^b , \qquad b < 1 . \qquad (7)$$



The exponent $b$ increases monotonically with time towards its
asymptotic value 1. The initial deviation $(1 - b)$ and the crossover
time to $b = 1$ both increase with $\epsilon/k_B T$. This behavior does
not depend on the choice of the initial condition, and it is not a
transient induced by some property of the $t = 0$ state: we have
verified that qualitatively the same time dependence is reproduced
after an initial transient time of $10^4$, $10^5$ or $10^6$ time steps.
As expected, once the normal diffusion regime is reached, different
temperatures correspond to different diffusion constants $D$ (in the
log-log representation, $2D$ determines the vertical offset of the
lines of slope 1, according to the relation
$\log \langle \Delta n^2 \rangle = \log 2D + \log t$).
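
One way to follow the drift of the exponent toward 1 is to take the local slope of the log-log curve between successive sampling times. The Python sketch below does this; the function name `local_exponent` and the synthetic curve $\sqrt{t} + 0.01\,t$ are hypothetical choices made for illustration, not data produced by the model.

```python
import math

def local_exponent(times, msd):
    """Local slope d log<dn^2> / d log t between successive samples."""
    slopes = []
    for k in range(len(times) - 1):
        dlog_t = math.log(times[k + 1]) - math.log(times[k])
        dlog_m = math.log(msd[k + 1]) - math.log(msd[k])
        slopes.append(dlog_m / dlog_t)
    return slopes

# Synthetic crossover curve (assumption, not model output):
# subdiffusive at short times, linear at long times.
times = [10.0 ** k for k in range(1, 8)]
msd = [math.sqrt(t) + 0.01 * t for t in times]
slopes = local_exponent(times, msd)
```

For a purely linear law the returned slopes are all equal to 1, whatever the value of $D$; for the synthetic curve they rise from about one half toward unity, which is the qualitative pattern described above.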

The plots of Fig. 6 also give a measure of the slowing down in the
promoter search induced by the sequence-dependent interaction. Indeed,
in the log-log plot the horizontal offset between different curves, at
a given $\Delta n^2$, corresponds to the logarithm of the ratio between
the times needed to cover the corresponding displacement $\Delta n$ for
different choices of $\epsilon/k_B T$. Therefore, if $\Delta n$ is a
typical distance to the promoter, the horizontal offset gives directly
the slowing factor induced by subdiffusion with respect to normal
diffusion. Referring to Fig. 6, we can conclude that, if the distance
to the promoter is larger than 100 bps (so that $\Delta n^2 = 10^4$),
then the time to reach the promoter should be increased with respect to
standard diffusion by roughly a factor of 10 for
$\epsilon/k_B T = 0.6$, and by a factor of 100 for
$\epsilon/k_B T = 0.9$. Furthermore, this slowing factor does not
depend on $\Delta n$, provided that $\Delta n$ is large enough for the
asymptotic regime to apply. Under this hypothesis, it is possible to
obtain an analytical estimate of the slowing factor (Barbi et al.,
2002).
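
The fact that the slowing factor no longer depends on the displacement in the asymptotic regime follows directly from the linear law. As a short worked step, write $D_0$ for the flat-potential diffusion constant and $D_\epsilon$ for the asymptotic one at finite interaction energy (these two subscripted symbols are notation introduced here for illustration only):

```latex
\langle \Delta n^2 \rangle = 2 D t
\;\Longrightarrow\;
t(\Delta n) = \frac{\Delta n^2}{2D},
\qquad
\frac{t_\epsilon(\Delta n)}{t_0(\Delta n)} = \frac{D_0}{D_\epsilon},
\qquad
\log t_\epsilon - \log t_0 = \log D_0 - \log D_\epsilon .
```

The displacement cancels in the ratio, so the horizontal offset between two unit-slope lines is a constant. With $D_0 = 1$, as in the flat case, the slowing factor in this regime reduces to $1/D_\epsilon$.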



Figure 7: Mean square displacement $\langle \Delta n^2 \rangle$ for the
four different models, with $\epsilon/k_B T = 1$ and $E_t = 0$, in the
log-log representation. Symbols: open circles, model I; triangles,
model II; diamonds, model III; squares, model IV ($E_t = 0$). The
straight lines correspond to the fit in the last part of the graphs
($t \in [6 \times 10^6, 10^7]$). Inset: the same curves in a linear
representation in the short-time regime (symbols have the same
meaning).



We now extend the diffusion analysis to the other versions of the
model introduced earlier. The resulting curves for models I to IV, for
$\epsilon/k_B T = 1$, are presented in Fig. 7. As for model I, in all
cases we observe at short times a subdiffusive regime due to the
trapping effect of the rough energy landscape.

The initial values of $b$, fitted in the time range (0, 100) with the
function $A t^b$, are the following for the first three models:

I: $b = 0.49 \pm 1\%$
II: $b = 0.61 \pm 1\%$
III: $b = 0.56 \pm 1\%$.

Note that model IV displays a particular behavior in this short-time
regime, which will be discussed below.
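
A fit of this kind can be sketched as a least-squares line through the data in log-log coordinates. The fitting procedure actually used for the values above is not specified in the text, so the method, the function name `fit_power_law` and the synthetic data below are all assumptions made for illustration.

```python
import math

def fit_power_law(times, msd):
    """Least-squares fit of msd = A * t**b, done on log(msd) vs log(t).

    Returns (A, b). All times and msd values must be positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope = exponent
    A = math.exp(mean_y - b * mean_x)     # intercept = log of prefactor
    return A, b

# Synthetic subdiffusive data on t = 1..100 (assumption, not model output).
times = list(range(1, 101))
msd = [2.0 * t ** 0.5 for t in times]
A, b = fit_power_law(times, msd)
```

The point $t = 0$ has to be left out because the logarithm is undefined there. A log-space fit weights all decades of time equally, which may or may not match the weighting of a direct nonlinear fit of $A t^b$.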

Let us remark that, in principle, the anomalous diffusion obtained
could be due to some particular spatial correlation properties of the
underlying potential. However, we have checked that it is due only to
the roughness of the landscape, by repeating the same numerical
experiment on an artificial, completely random base sequence. Under the
conditions of model I, for instance, and in the same fit range, we
obtained $b = 0.52 \pm 1\%$ and a curve similar to that of the T7 DNA
case (data not shown).

Figure 8: Behavior of the exponent $b$, as fitted in the short-time
regime $t \in (0, 100)$, as a function of the threshold energy $E_t$
for model III. The vertical line corresponds to
$\max[E(n)] = 5\epsilon$.



We then studied the behavior of the short-time subdiffusive exponent
$b$ as a function of $E_t$ for model III with varying threshold (i.e.,
including models I and II). The results are shown in Fig. 8. For
thresholds lower than a critical value of about $-3\epsilon$, the
system displays almost no sensitivity to the threshold level. This is
because, for the threshold to have a relevant effect, it is not enough
to have a site $n$ with $E(n) < E_t$: at least two neighboring sites
must also lie below the threshold (see the definition of the
translocation barriers). The probability of finding two adjacent sites
below the threshold is too low for $E_t < -3\epsilon$, which explains
the observed insensitivity. Interestingly, the exponent $b$ becomes a
nonmonotonic and very sensitive function of $E_t$ for larger values of
$E_t$. The effect of the threshold in this intermediate regime is in
fact twofold: on the one hand, it induces an additional damping on many
low-energy sites; on the other, it makes (a fraction of) these same
sites "blind" to the energies of their neighbors (the translocation
barriers then depend only on $E(n)$ and $E_t$). The complex balance
between the two contributions induces the high instability of the fit
results displayed in Fig. 8. As the threshold increases above the
maximum level ($E_t = 5\epsilon$), the disorder of the underlying
energy landscape becomes less and less important, and the system tends
to recover a strongly damped standard diffusive behavior, i.e., with
$b \to 1$ and $A \to 0$.

Now let us consider the large-time limit. The asymptotic diffusion
constant depends on the model choice. A linear fit of the large-time
regime of $\langle \Delta n^2 \rangle$ in Fig. 7 has been performed in
order to estimate the average diffusion constant $D$, in the random
walk approximation where $\langle \Delta n^2 \rangle = 2Dt$. In
addition, we checked that an effectively linear behavior is reached in
the corresponding time range by fitting again with the function
$\langle \Delta n^2 \rangle = A t^b$ and verifying that $b$ is close to
unity. The resulting diffusion constants $D$ and exponents $b$ for the
four models at large times ($t \in [6 \times 10^6, 10^7]$) are given,
for $E_t = 0$, respectively by:

The corresponding fits are the straight lines in Fig. 7.

Figure 9: Behavior of the coefficient $2D$, as fitted in the large-time
regime $t \in (8 \times 10^5, 10^6)$, as a function of the threshold
energy $E_t$ for model III. The level $\max[E(n)] = 5\epsilon$ is
represented by a vertical line.



The differences in the equilibrium diffusion constant between the
models are directly related to the activation barrier in the four
cases: the higher the barrier to overcome in order to move one step,
the lower the diffusion constant. Note that in the case of model IV the
boundaries between flat and rough regions act as energy barriers of
amplitude $\approx E_{sl}$: these barriers appear to affect the motion
more strongly than the threshold $E_t$, resulting in a diffusion
constant closer to that of model II than to that of model III.

We have also analyzed the dependence of the asymptotic diffusion
constant $D$ on $E_t$. Fig. 9 shows the dependence of $D$ on $E_t$ in
model III (see footnote 5). Again, almost no sensitivity to the
threshold level is observed below a critical value, approximately
$E_t = -3\epsilon$. Roughly between this value and $E_t = 0$, we
observe a transition to a regime of strong sensitivity ($E_t > 0$),
where the damping effect induced by the threshold is much stronger. The
diffusion constant decreases rapidly above the maximal energy
($E_m = 5\epsilon$, vertical line), as intuitively expected.




Figure 10: Time behavior of $\langle \Delta n^2 \rangle$ for model IV,
in the cases $E_t = -4$ (full squares), $E_t = -2$ (circles),
$E_t = 0$ (full triangles), $E_t = 2$ (diamonds), with
$\epsilon/k_B T = 1$. Two straight lines of slope 1 are shown for
comparison.



We shall now discuss model IV in detail, since it displays a more
complicated behavior than the others. Note that, in principle, model IV
can be put in exactly the same scheme as the other models, once the
underlying potential $E(n)$ is redefined as specified for this model.
Nevertheless, this redefinition of the energy landscape leads to
substantially different features. As can be observed in Fig. 10, during
an initial time interval the polymerase diffuses more rapidly, even if
still subdiffusively, with an initially larger effective diffusion
constant. The initial speeding up of the dynamics becomes more
pronounced as the value of the threshold decreases, i.e., as the energy
redefinition involves an increasing number of sites. This effect can be
explained by considering how the potential landscape is changed for
model IV. Among the particles, uniformly distributed at time zero over
a large region of the sequence, all those that are initially on flat
regions of energy $E_{sl}$ start diffusing freely, with diffusion
constant equal to 1, until they fall into a region with $E < E_t$.
These particles initially contribute a large term to the diffusion,
thus increasing it. After an initial transient, however, most of the
particles are almost trapped in the potential wells, and the effective
diffusion coefficient decreases accordingly.

5 For technical reasons, we display data resulting from the fit in the
range $(8 \times 10^5, 10^6)$, i.e., in a region where the parameter
$b$ has not yet reached unity. The curve of Fig. 9 therefore represents
only a qualitative analysis and shows some small discrepancy with the
values given in the text.

More precisely, the trapping effect depends on the value of $E_{sl}$,
set to $\max[E(n)]$ in our calculations. If $E_{sl}$ is large enough,
most of the particles will be trapped in regions with $E(n) < E_t$,
with activation barriers and only a small probability of escaping again
toward the flat plateaux. Therefore, in the long-time regime, the
system will be essentially in the same state as in model III, but
mostly localized in some finite regions. In other words, the particular
equilibrium conditions introduced in model IV are such that a particle
needs to spend a large amount of energy (and, therefore, of time)
before reaching a high-level plateau but, once there, it can move much
faster to the next favorable site. An analytical derivation of the main
dynamical quantities as functions of the model parameters discussed in
this section will be presented elsewhere (Barbi et al., 2002).



3 Discussion 

All the results presented in this work can be checked by comparison
with detailed experimental data. As mentioned in the introduction,
experiments leading to a rather precise determination of the RNAP
position along DNA at different times during the promoter search have
already appeared (Kabata et al., 1993; Harada et al., 1999; Guthold et
al., 1999), and others are in progress (Place, 2002). This will make it
possible, for the first time, to estimate the detailed features of the
T7 RNAP diffusive motion. As we have shown, a dynamical model which
includes both the affinity for the promoter and the possibility of
sliding leads to a nontrivial sequence-dependent dynamics, at least in
some range of the parameters. It is thus important to verify whether
these effects can actually be observed experimentally. The sliding
distance is kinetically evaluated in different experiments to be around
350 to 1000 bps (Shimamoto, 1999, and references therein). This is
probably not peculiar to RNAP, since other enzymes also seem to slide
along the DNA, covering a short distance of about 300 bps before being
released into solution (Stanford et al., 2000a). On this spatial scale,
the anomalous diffusion behavior is predominant in our model.

In particular, the recent scanning force microscope (SFM) experiment
performed by Guthold et al. (1999) allows for a direct observation of
one E. coli RNAP sliding back and forth on a single DNA chain partially
adsorbed on a mica surface, although with some technical limitations
(the average lifetime of the nonspecific complex is more than a hundred
times larger than that measured in solution, probably due to the
two-dimensional constraints). The statistical properties of the
observed diffusive motion have been fitted by the normal diffusion law,
with a mean square displacement equal to $2Dt$, in order to confirm the
general assumption that RNAP moves randomly along DNA (Guthold et al.,
1999, Fig. 2). Quantitatively, however, in the observed displacement
range (less than two hundred base pairs), the corresponding data seem
to deviate from a pure diffusive motion. This may be due to the
experimental constraints and to the limited number of RNAP sliding
trajectories (about 30). On the other hand, a rough estimate from the
numerical data of Fig. 2 of Guthold et al. (1999), fitted with a power
law of the type $A t^b$, gives $b \sim 0.5 \pm 15\%$. It is very
interesting to note that these data seem much more compatible with a
subdiffusive behavior than with the normal diffusion that is usually
assumed. This first experiment allowing a direct visualization of the
RNAP sliding motion therefore gives, from our point of view, intriguing
and encouraging results.

We remark that the dynamical features described here depend crucially
on the choice of the model parameters: the ratio $\epsilon/k_B T$, the
value of the energy threshold $E_t$ and, in the case of model IV, the
energy of the plateaux $E_{sl}$. As a first check, we can compare the
rough estimate of the power exponent that we extrapolate from the
results of Guthold et al. (1999) with the behavior of the model as a
function of $\epsilon/k_B T$. The value of about 0.5 very roughly
corresponds to $\epsilon/k_B T \approx 1$ for all values of $E_t$,
confirming that the parameter choice made in most of our simulations
could indeed be of the right order of magnitude.

Further experimental investigations, devoted to the detailed
determination of the nonspecific interaction, are necessary to improve
the model. The version of the model that is compatible with the sliding
RNAP dynamics observed in single-molecule experiments should emerge
from comparison with the experimental data, using the model parameters
as fitting parameters. In practice, the complicated diffusive behavior
of the model will allow us to compare theory and experiments by means
of more than one dynamical observable. For model IV, where the
additional parameter $E_{sl}$ is needed, the presence of a new specific
short-time feature could be used in the fit of the experimental
results.

From a biological point of view, the four models offer a framework for
defining the parameters pertinent to optimizing the promoter search.
For all models, the specific interaction energy $\epsilon$ between RNAP
and DNA is crucial and should be close to $k_B T$ in order to allow the
polymerase to move. This adjustment of the interaction energy can be
achieved by varying the distance and angle of the H-bonds during
sliding. Perhaps the most interesting model from a biological point of
view is model IV, since it allows for a better control of the diffusion
pattern and, consequently, of the corresponding biological function. In
a biological system, an exact balance has to be found between the
reading and sliding modes. $E_t$, $E_{sl}$ and $\epsilon/k_B T$ have to
be optimized for the biological purpose, which will be physically
reflected in the protein-DNA interaction and in the DNA sequence.

Finally, it is important to keep in mind that the recognition mechanism
through hydrogen bonds considered here does not allow for a complete
identification of the promoters. The recognition sequence GACTC (or the
complementary sequence GAGTC) appears in the complete T7 genome more
than 90 times; however, only 10 of these occurrences actually belong to
17 bps long promoters. Evidently, other "signals" cooperate with the
direct pattern recognition mechanism in order to allow the polymerase
to find its target. The weak sequence TAATA (positions -13 to -17), for
instance, also interacts with RNAP through the minor groove (Cheetham
et al., 1999). A sensitivity to this minor groove region should
probably be included. In this sense, our model represents a first
attempt towards a detailed description of the RNAP dynamics during the
promoter search.

The model can also be extended to other enzymes by a detailed
introduction of their sequence-dependent interaction with nonspecific
DNA. We believe that the main idea of the model, namely the link
between base sequence and enzyme dynamics, will be valid in general.
Indeed, as long as a sequence dependence is considered, the enzyme will
always interact with DNA through an effective potential with a
fluctuating profile. This potential will be induced by different kinds
of interaction for different enzymes. Its roughness by itself, however,
will always generate anomalous diffusion features such as those
described in this paper.



Conclusions 

In this paper we have proposed a simple model for the RNAP sliding
motion along DNA, which includes a sequence-dependent interaction. We
deduced a hypothetical polymerase-DNA interaction from the
crystallographic structure of the T7 polymerase-promoter complex
(Cheetham et al., 1999). We have included four possible variations by
considering slightly different translocation probabilities, i.e., by
introducing a variable activation barrier $E_t$ (leading to models I to
III), and finally by distinguishing "reading" regions from "sliding"
regions, where no hydrogen bonds are made so that the RNAP can freely
diffuse on an effective constant potential (model IV).

A numerical study of the diffusion properties of the four versions of
the model shows that a normal diffusion regime is only achieved after
some time. We have shown that all four models are characterized at
shorter times by a subdiffusive behavior. A rough estimate of the
slowing factor induced by the sequence dependence for different values
of the energy parameter can be easily obtained. This result is of
particular interest because, as we have discussed, the anomalous
diffusion is observed in a range that corresponds approximately to the
experimentally observed characteristic distance covered by the RNAP
during sliding (Shimamoto, 1999). The physical reasons underlying the
different diffusion behaviors have been discussed.

Nowadays, the existing nanotechnologies and single-molecule techniques
allow for constraining and manipulating single biological objects. The
present paper represents a first step towards a theoretical picture in
which some of the resulting experimental results could be analyzed and
connected with the known functional properties of the corresponding
biological systems. It is important to keep in mind, however, that the
in vivo dynamics of the corresponding biological processes occurs in a
high-density environment, in the presence of very complex spatial
structures and of water molecules that are mainly bound and structured
(Goodsell, 1992). What we usually call the diffusive motion of proteins
inside the cell is likely to be instead a motion strongly dependent on
a complex set of environmental trapping sites, as in the case
considered here. Also in this respect, the approach proposed in this
paper may have a wider range of application.